
Spark Tasking v3

This document describes the third iteration on designing deterministic task assignment to Spark checker nodes. Prior work:

Motivation

At the moment, checker nodes can choose their tasks in whatever way they want. This allows SPs to collude with checker operators, e.g. by checking only their own deals.

The v3 design makes the task→node assignment deterministic, so nodes can no longer choose their tasks.

This improvement is a prerequisite for further protocol improvements:

  • Ensure most committees are large enough to give us confidence in majority being honest
  • Group measurements by the task (the committee), find & accept the majority, reject minority results.

Requirements

  1. Make Sybil attacks difficult/expensive
    We want to limit how many checker instances a single party operates by disincentivising operating too many instances.
  2. Make SP collusion difficult
    We want to make it difficult to run colluding SPARK nodes that limit the retrieval checks to a given SP or perhaps a small fixed set of CIDs.
    Essentially, the checker nodes must not be able to self-deal (choose the tasks to perform).
  3. (Mostly) uniform committee sizes
    Our ā€œProof of Retrievalā€ solution requires a committee in which an honest majority can be formed.
    At the end of each round, we want all committees to have roughly the same size. The size must be large enough to give us confidence that the majority is honest. (A committee is the list of nodes that measured a given task - the pair (CID, minerId).)
    šŸ¤” One thing to take into account is that an honest majority is not always enough, given that checkers can experience false positives due to network issues etc.
  4. No membership service
    The algorithm must assign tasks to checker nodes in a probabilistic manner that does not require us to maintain a list of online nodes (a membership service).
  5. Rate limiting
    We want to rate-limit how many jobs are performed by each checker.
    • Without rate limiting & fraud detection, a fraudulent node can short-circuit job execution by skipping the retrieval and reporting fake results instead. This allows nodes to cheaply report many completed jobs, which is later translated into a disproportionately large impact & reward.
    • Even with fraud detection in place, if we don’t rate-limit the checks, then nodes on fast, unmetered internet connections can tweak the SPARK checker code to perform more checks than we designed for. This can put too much pressure on SPs. Plus, the operator will get a larger portion of rewards, which is unfair to honest nodes.
    • We want fairness - i.e. guarantees that no party can amplify their fraction of the total reports. Otherwise, even if I control only 1% of the nodes, if those are all ā€œfastā€ nodes, I could still potentially submit 50% of the total reports - so this is about Sybils and scarce resources.
    • We must avoid DDoSing SPs and spread the load across all SPs. If the entire SPARK network checks the same (CID, SP) pair, we will effectively DDoS the poor SP.
  6. IPv4 as a scarce resource
    The algorithm must disincentivise operators running more than one node in a given IPv4 /24 subnet.
  7. Cheap to verify
    The Station network has ~30k nodes and ~4k unique participant addresses. Each round, we test 1k different tasks and collect ~700k measurements. The tasking algorithm must allow the fraud-detection service to efficiently verify whether a measurement was submitted for a task that’s valid for the given node and round.

Out of scope (for now)

  • How to choose a list of tasks for the current round from all deals that are expected to be retrievable.
  • How to maintain the number of jobs performed per round at a constant rate that’s independent of the network size.

Current status (May 2024)

  • For each round, the Round Tracker service picks 1000 tasks to choose from.
  • Checker nodes pick tasks at random (or at their will) from that list.
  • IPv4 as a scarce resource + rate limiting: the Evaluate service accepts only the first 15 measurements from each IPv4 /24 subnet, ignoring duplicate measurements for the same task.

Available inputs

  • DRAND beacon - 32 bytes (hex encoded as 64 chars) on the default drand chain
  • Participant address in ETH 0x format, not unique; controlled by the checker node operator
  • Station ID - a public key, 44 bytes (hex encoded as 88 chars), unique; controlled by the checker node operator
  • A list of tasks for the current round. A task is defined as (cid, miner_id).

Proposed design

Use an XOR-based distance metric to find the K tasks that are closest to each Station node.

Note: Station IDs are 44 bytes long, while SHA-256 produces 32-byte digests. When you XOR a 44-byte value (Station ID) with a 32-byte value (hash), the first 12 bytes of the result are always the same. I.e. for each nodeKey, the dist values will always start with the same 12 bytes. While this isn’t a problem for picking the K closest tasks, I find it confusing and therefore prefer to use only the last 32 bytes of the Station ID value.

  1. For each task that’s valid for the round, calculate its key by hashing the task together with the randomness for the current Spark round. The way the hash function works, these hash values will be randomly & uniformly distributed over the entire space of 2^256 values. By including the randomness in the inputs, we get a different mapping in each round.
    taskKey = sha256([cid, miner_id, drand].join('\n'))
  2. For each node, use the last 32 bytes (64 hex characters) of the station_id as the key.
    nodeKey = Buffer.from(station_id.slice(-64), 'hex')

    UPDATE
    Let’s hash the Station ID instead; see the Update section below.

    nodeKey = sha256(Buffer.from(station_id))
  3. Find the K tasks closest to the node, using XOR as the distance metric.
    dist = taskKey ^ nodeKey

    This can be implemented using a priority queue in O(T * log(K)) time, where T is the number of tasks for this round and K is the number of tasks each node can perform. Note that T >> K - currently T=1000 and K=15.

IPv4 as a scarce resource

The fraud detection algorithm already rewards only the first 15 measurements from each IPv4 /24 subnet. The tasking algorithm described above is orthogonal and does not require any changes to this mechanism.
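For illustration, the /24-subnet rate limit can be sketched as below. The names `subnetKey` and `filterBySubnet` are hypothetical, and the real Evaluate service may derive the subnet group differently:

```javascript
// Hypothetical: map an IPv4 address to its /24 subnet group.
const subnetKey = (ipv4) => ipv4.split('.').slice(0, 3).join('.') + '.0/24'

// Accept only the first 15 measurements from each /24 subnet.
const MAX_PER_SUBNET = 15

const filterBySubnet = (measurements) => {
  const counts = new Map()
  return measurements.filter(({ ip }) => {
    const key = subnetKey(ip)
    const count = counts.get(key) ?? 0
    if (count >= MAX_PER_SUBNET) return false
    counts.set(key, count + 1)
    return true
  })
}
```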

Verification

The fraud detection component needs to:

  • Calculate T task keys (T invocations of sha256).
  • Assuming N unique nodes, calculate the set of tasks assigned to each node (N invocations of the k-closest-element algorithm on a constant data set).
  • For each measurement, check that it corresponds to a valid task for the node that submitted it.
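This flow can be sketched as follows, assuming a `pickTasksForNode(tasks, stationId, drand, k)` function implementing the k-closest selection from the proposed design (the name is illustrative). Memoizing the task set per Station ID keeps the number of k-closest computations at one per unique node rather than one per measurement:

```javascript
// Returns one boolean per measurement: was it submitted for a task
// that is valid for the submitting node in this round?
const verifyMeasurements = (measurements, tasks, drand, k, pickTasksForNode) => {
  const validTasksByNode = new Map() // station_id -> Set of "cid\nminerId"
  return measurements.map((m) => {
    let valid = validTasksByNode.get(m.stationId)
    if (valid === undefined) {
      // Expensive step, executed once per unique Station ID.
      valid = new Set(
        pickTasksForNode(tasks, m.stationId, drand, k)
          .map((t) => `${t.cid}\n${t.minerId}`))
      validTasksByNode.set(m.stationId, valid)
    }
    return valid.has(`${m.cid}\n${m.minerId}`)
  })
}
```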

Attack vectors

Self-dealing

A malicious actor can pick the tasks they want to check and then choose such a Station ID value that makes the task belong to the k-closest tasks for that Station ID.

We can make this attack less attractive by giving more weight to measurements contributed by Station IDs with longer history. This weight can be applied to rewards but also when aggregating measurements to produce the Retrieval Success Rate and other metrics.

DoS

A malicious actor can provide a unique station ID for each measurement, forcing the fraud detection component to run the k-closest-element algorithm for each measurement.

This remains an open problem.

Performance

When using the heap-based k-closest-element implementation to build the list of K closest tasks for each node, the duration of the evaluation of a single round increases by 50% from 6 seconds to 9 seconds.

I think we can further improve the performance of this part by rewriting the algorithm in Rust & WebAssembly, but I propose to defer such optimisations until later.

Committee sizes

Using the measurements submitted for the round 10788, if we assume that all station instances keep the same inet_group for the entire duration of this round, and submit one measurement for each valid task, we will get the following distribution. The last column shows the current distribution when nodes pick their tasks randomly.

Committees created by deterministic tasking

METRIC            | NEXT GEN | CURRENT
measurements_max  | 1783     | 363
measurements_mean | 916      | 176
measurements_min  | 473      | 29
measurements_p1   | 504      | 37
measurements_p5   | 577      | 42
measurements_p10  | 647      | 44
measurements_p50  | 945      | 158
measurements_p90  | 1115     | 316
measurements_p95  | 1265     | 324
measurements_p99  | 1656     | 342
nodes_max         | 1045     | 322
nodes_mean        | 635      | 162
nodes_min         | 380      | 29
nodes_p1          | 404      | 36
nodes_p5          | 445      | 41
nodes_p10         | 486      | 44
nodes_p50         | 653      | 147
nodes_p90         | 747      | 287
nodes_p95         | 809      | 295
nodes_p99         | 1006     | 310
participants_max  | 296      | 183
participants_mean | 153      | 85
participants_min  | 87       | 11
participants_p1   | 93       | 17
participants_p5   | 100      | 20
participants_p10  | 109      | 22
participants_p50  | 157      | 76
participants_p90  | 185      | 155
participants_p95  | 202      | 161
participants_p99  | 279      | 170
subnets_max       | 654      | 283
subnets_mean      | 440      | 144
subnets_min       | 287      | 28
subnets_p1        | 303      | 35
subnets_p5        | 334      | 40
subnets_p10       | 358      | 42
subnets_p50       | 448      | 135
subnets_p90       | 500      | 252
subnets_p95       | 532      | 258
subnets_p99       | 633      | 269


How to read the tables above

Compare the difference between min/mean/max numbers.

subnets

Before the change, the smallest committee had measurements from only 28 different subnets. After the change, all committees have at least 287 subnets.

Let’s look at the min-p10-p90-max ranges, and also normalise the values so that the mean average is the same for ā€œbeforeā€ and ā€œafterā€ data:

             | min | p10 | p90 | max
After (raw)  | 287 | 358 | 500 | 654
After (norm) |  93 | 117 | 163 | 207
Before       |  28 |  42 | 252 | 283

It’s clear that the new algorithm reduces the variance in the number of subnets per committee.

participants

Before the change, the smallest committee had measurements from only 11 different participants. After the change, all committees have at least 87 participants.

Let’s look at the min-p10-p90-max ranges, and also normalise the values so that the mean average is the same for ā€œbeforeā€ and ā€œafterā€ data:

             | min | p10 | p90 | max
After (raw)  |  87 | 109 | 185 | 296
After (norm) |  48 |  61 | 103 | 164
Before       |  11 |  22 | 155 | 183

It’s clear that the new algorithm reduces the variance in the number of participants per committee.
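For clarity, here is a sketch of the normalisation used in these tables, assuming each ā€œafterā€ value is scaled by the ratio of the two means (e.g. for participants: mean 153 after, mean 85 before, per the big table above):

```javascript
// Scale values so that the "after" distribution has the same mean as
// the "before" distribution, to make the spreads comparable.
const normalise = (values, meanAfter, meanBefore) =>
  values.map((v) => Math.round((v * meanBefore) / meanAfter))

// participants min/p10/p90/max:
normalise([87, 109, 185, 296], 153, 85) // → [48, 61, 103, 164]
```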

measurements

The number of dummy (next-gen) measurements is 2.6x higher than the number of real measurements submitted in the inspected round. When we normalize the percentile values coming from the deterministic tasking algorithm, we get the following table:

             | min | p10 |  p90 |  max
After (raw)  | 473 | 647 | 1115 | 1783
After (norm) |  91 | 124 |  214 |  343
Before       |  29 |  44 |  316 |  363

Again, the new algorithm spreads the measurements more evenly and reduces the number of committees that have very few or too many measurements compared to the average committee.

Conclusion

The proposed design has acceptable performance and significantly improves the quality of committees.

To have enough confidence that a majority of each committee is honest, we need each committee to have at least 40-50 participants. The current algorithm does not meet that requirement for at least 10% of committees. The proposed algorithm seems to meet this requirement in all committees.

Update

I realised that using the Station ID, which is a public key, may not give us a uniform distribution, because there are no guarantees that public keys are uniformly distributed. (Or are there?)

A better solution is to hash the Station ID using the same hashing function we use for building the task key.

Comparison of this new version to the algorithm described above:

                | min | p10 |  p90 |  max
participants v1 |  87 | 109 |  185 |  296
participants v2 |  77 | 111 |  185 |  290
subnets v1      | 287 | 358 |  500 |  654
subnets v2      | 286 | 353 |  498 |  652
measurements v1 | 473 | 647 | 1115 | 1783
measurements v2 | 470 | 657 | 1128 | 1855

Interestingly enough, the more ā€œcorrectā€ algorithm gives us slightly worse committees. Considering the difference is very small, I propose to hash the Station ID to get more confidence in the correctness of our algorithm.